最近,神经主题模型(NTMS)已纳入预训练的语言模型(PLM)中,以捕获用于文本摘要的全局语义信息。但是,在这些方法中,它们捕获和整合全局语义信息的方式仍然存在局限性。在本文中,我们提出了一个新颖的模型,即图形对比主题增强语言模型(GRETEL),该模型将图形对比主题模型与预训练的语言模型结合在一起,以充分利用长文档提取的全球和本地上下文语义摘要。为了更好地捕获并将全局语义信息纳入PLM,图形对比主题模型集成了层次变压器编码器和图形对比度学习,以从全局文档上下文和金摘要中融合语义信息。为此,Gretel鼓励该模型有效提取与黄金摘要有关的显着句子,而不是涵盖亚最佳主题的多余句子。对通用域和生物医学数据集的实验结果表明,我们所提出的方法优于SOTA方法。
translated by 谷歌翻译
点击率(CTR)预测旨在估算用户单击项目的可能性,是在线广告的重要组成部分。现有方法主要尝试从用户的历史行为中挖掘用户兴趣,这些行为包含用户直接交互的项目。尽管这些方法取得了长足的进步,但通常会受到推荐系统的直接曝光和不活动相互作用的限制,因此无法挖掘所有潜在的用户利益。为了解决这些问题,我们提出了基于邻居相互作用的CTR预测(NI-CTR),该预测在异质信息网络(HIN)设置下考虑此任务。简而言之,基于邻居相互作用的CTR预测涉及HIN目标用户项目对的本地邻域以预测其链接。为了指导当地社区的表示形式,我们从显式和隐性的角度考虑了本地邻里节点之间的不同类型的相互作用,并提出了一种新颖的图形掩盖变压器(GMT),以有效地将这些类型的交互结合到为目标用户项目对生成高度代表性的嵌入。此外,为了提高针对邻居采样的模型鲁棒性,我们在嵌入邻里的嵌入式上执行了一致性正规化损失。我们对数百万个实例进行了两个现实世界数据集进行了广泛的实验,实验结果表明,我们所提出的方法的表现明显优于最先进的CTR模型。同时,全面的消融研究验证了我们模型每个组成部分的有效性。此外,我们已经在具有数十亿用户的微信官方帐户平台上部署了此框架。在线A/B测试表明,针对所有在线基线的平均CTR改进为21.9。
translated by 谷歌翻译
Machine learning models are typically evaluated by computing similarity with reference annotations and trained by maximizing similarity with such. Especially in the bio-medical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect the annotation entity's interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating to better Real World Model Performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, three categories of PGT-aware strategies to evaluate and improve model performance are reviewed.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
We test grip strength and shock absorption properties of various granular material in granular jamming robotic components. The granular material comprises a range of natural, manufactured, and 3D printed material encompassing a wide range of shapes, sizes, and Shore hardness. Two main experiments are considered, both representing compelling use cases for granular jamming in soft robotics. The first experiment measures grip strength (retention force measured in Newtons) when we fill a latex balloon with the chosen grain type and use it as a granular jamming gripper to pick up a range of test objects. The second experiment measures shock absorption properties recorded by an Inertial Measurement Unit which is suspended in an envelope of granular material and dropped from a set height. Our results highlight a range of shape, size and softness effects, including that grain deformability is a key determinant of grip strength, and interestingly, that larger grain sizes in 3D printed grains create better shock absorbing materials.
translated by 谷歌翻译
Accurate uncertainty measurement is a key step to building robust and reliable machine learning systems. Conformal prediction is a distribution-free uncertainty quantification algorithm popular for its ease of implementation, statistical coverage guarantees, and versatility for underlying forecasters. However, existing conformal prediction algorithms for time series are limited to single-step prediction without considering the temporal dependency. In this paper we propose a Copula Conformal Prediction algorithm for multivariate, multi-step Time Series forecasting, CopulaCPTS. On several synthetic and real-world multivariate time series datasets, we show that CopulaCPTS produces more calibrated and sharp confidence intervals for multi-step prediction tasks than existing techniques.
translated by 谷歌翻译
For conceptual design, engineers rely on conventional iterative (often manual) techniques. Emerging parametric models facilitate design space exploration based on quantifiable performance metrics, yet remain time-consuming and computationally expensive. Pure optimisation methods, however, ignore qualitative aspects (e.g. aesthetics or construction methods). This paper provides a performance-driven design exploration framework to augment the human designer through a Conditional Variational Autoencoder (CVAE), which serves as forward performance predictor for given design features as well as an inverse design feature predictor conditioned on a set of performance requests. The CVAE is trained on 18'000 synthetically generated instances of a pedestrian bridge in Switzerland. Sensitivity analysis is employed for explainability and informing designers about (i) relations of the model between features and/or performances and (ii) structural improvements under user-defined objectives. A case study proved our framework's potential to serve as a future co-pilot for conceptual design studies of pedestrian bridges and beyond.
translated by 谷歌翻译
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
translated by 谷歌翻译
Many high-level skills that are required for computer vision tasks, such as parsing questions, comparing and contrasting semantics, and writing descriptions, are also required in other domains such as natural language processing. In this paper, we ask whether this makes it possible to learn those skills from text data and then use them to complete vision tasks without ever training on visual training data. Key to our approach is exploiting the joint embedding space of contrastively trained vision and language encoders. In practice, there can be systematic differences between embedding spaces for different modalities in contrastive models, and we analyze how these differences affect our approach and study a variety of strategies to mitigate this concern. We produce models using only text training data on three tasks: image captioning, visual entailment and visual question answering, and evaluate them on standard benchmarks using images. We find that this kind of transfer is possible and results in only a small drop in performance relative to models trained on images. We also showcase a variety of stylistic image captioning models that were trained using no image data and no human-curated language data, but instead text data from books, the web, or language models.
translated by 谷歌翻译
We describe the AGReE system, which takes user-submitted passages as input and automatically generates grammar practice exercises that can be completed while reading. Multiple-choice practice items are generated for a variety of different grammar constructs: punctuation, articles, conjunctions, pronouns, prepositions, verbs, and nouns. We also conducted a large-scale human evaluation with around 4,500 multiple-choice practice items. We notice for 95% of items, a majority of raters out of five were able to identify the correct answer and for 85% of cases, raters agree that there is only one correct answer among the choices. Finally, the error analysis shows that raters made the most mistakes for punctuation and conjunctions.
translated by 谷歌翻译